Deep Learning Approaches to Automatic Beat Game Chart Generation

by Peter de Blanc + ChatGPT Deep Research
Posted to Adarie (www.adarie.com) on July 22, 2025
Content License: Creative Commons CC0 (No Rights Reserved)


Generating step charts (beatmaps) from audio using deep learning has seen active research and development across various rhythm games. Recent methods typically use neural networks to predict timing (when notes occur) and pattern selection (which actions or positions) for games like DDR/StepMania, Osu!, Beat Saber, and others. Crucially, many systems incorporate difficulty modeling – allowing generation of easier or harder charts – and have been evaluated in game-like settings for playability. Below is a structured survey of notable deep learning–based approaches, organized by game type, with their key features, targeted games, ML techniques, and notes on difficulty and output quality.
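To make this two-stage decomposition concrete, the sketch below shows the general shape such systems take in PyTorch: a convolutional/recurrent onset model that scores each audio frame for "is there a note here?", and a separate step selector that autoregressively picks which action goes on each chosen frame, conditioned on a difficulty embedding. This is a minimal sketch of the common pattern, not any one paper's architecture; the layer sizes, feature dimensions, and vocabulary are illustrative assumptions.

```python
# Minimal sketch of the common two-stage pipeline (illustrative, not any
# specific paper's architecture). Stage 1 predicts WHEN notes occur; stage 2
# predicts WHICH action each note is, conditioned on a difficulty level.
import torch
import torch.nn as nn

N_MELS, N_ACTIONS, N_DIFFICULTIES = 80, 4, 5   # assumed sizes (4-panel game)

class OnsetModel(nn.Module):
    """Stage 1: mel-spectrogram frames -> per-frame note probability."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(               # small CNN over (freq, time)
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.rnn = nn.LSTM(16 * N_MELS, 128, batch_first=True, bidirectional=True)
        self.head = nn.Linear(256, 1)

    def forward(self, spec):                     # spec: (B, 1, N_MELS, T)
        h = self.conv(spec)                      # (B, 16, N_MELS, T)
        B, C, F, T = h.shape
        h = h.permute(0, 3, 1, 2).reshape(B, T, C * F)
        h, _ = self.rnn(h)
        return torch.sigmoid(self.head(h)).squeeze(-1)   # (B, T) probabilities

class StepSelector(nn.Module):
    """Stage 2: autoregressive action model conditioned on difficulty."""
    def __init__(self):
        super().__init__()
        self.action_emb = nn.Embedding(N_ACTIONS + 1, 32)  # +1 for start token
        self.diff_emb = nn.Embedding(N_DIFFICULTIES, 8)
        self.rnn = nn.LSTM(32 + 8, 128, batch_first=True)
        self.head = nn.Linear(128, N_ACTIONS)

    def forward(self, prev_actions, difficulty):  # (B, T) ints, (B,) ints
        a = self.action_emb(prev_actions)
        d = self.diff_emb(difficulty).unsqueeze(1).expand(-1, a.size(1), -1)
        h, _ = self.rnn(torch.cat([a, d], dim=-1))
        return self.head(h)                       # (B, T, N_ACTIONS) logits

spec = torch.randn(1, 1, N_MELS, 200)             # a few seconds of audio
onset_probs = OnsetModel()(spec)                  # then threshold / peak-pick
start = torch.full((1, 5), N_ACTIONS)             # start tokens for 5 notes
logits = StepSelector()(start, torch.tensor([3])) # difficulty level 3 of 5
```

In practice, the onset probabilities are thresholded or peak-picked into timestamps, and those timestamps drive the step selector; most of the systems surveyed below vary one or both of these stages.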

Dance Games: DDR & StepMania (4-Panel)

Keyboard Rhythm Games: Osu!mania (4-Key)

VR Rhythm Games: Beat Saber

2D Rhythm Games: Osu! (Standard Mode)

Drum Rhythm Games: Taiko no Tatsujin

Production Systems in Industry

Comparison Summary

The table below summarizes the above projects, highlighting the game targets, AI approaches, difficulty handling, and notable results:

| Project (Year) | Target Game(s) | Approach (Models) | Difficulty Support | Output Quality / Notes |
| --- | --- | --- | --- | --- |
| Dance Dance Convolution (2017) | DDR (StepMania), 4-panel | CNN + LSTM for timing; conditional LSTM for steps (two-stage) | Yes – conditioned on difficulty input | Playable DDR charts; user demo with ~3.87/5 satisfaction. First deep learning DDR chart generator and a baseline for later work. |
| Udo et al. (2020–2023) | DDR (StepMania), 4-panel | CNN/LSTM onset detector + rule-based refinement filter | Yes – generates, then prunes to a target density (see the sketch after the table) | Multi-level charts with accurate note density at each level. Used a reference TPM (notes/min) to match the intended difficulty. |
| Liang et al. (2019) | Osu!mania, 4-key | BLSTM (C-BLSTM) sequence model with “fuzzy” labels for ambiguity | Yes – difficulty treated as an input feature | Improved timing F-score (0.84); charts felt more natural than prior work. Focused on supervised PCG for the 4-key mode. |
| Beat Sage (2020) | Beat Saber (VR) | Two neural nets (CNN/LSTM-style): one for timing, one for block placement | Yes – supports Normal through Expert+ | Production-quality auto-mapper; generated maps often rival community maps in fun for suitable songs. Widely used via the web. |
| DeepSaber (2020) | Beat Saber (VR) | Multi-input LSTM; action-embedding + MLSTM architecture | Yes – difficulty and other features included | Research project (thesis). Introduced action “word” embeddings and novel metrics; showed the feasibility of ML for complex VR patterns. |
| Sypteras “AIsu” (2018) | Osu! standard (click circles) | CNN classifier for hits + heuristic placement (Markov chain) | Yes – outputs two fixed difficulties (medium & hard) simultaneously | First community DL mapper for Osu!. Web demo generated playable maps (required some manual tweaking). Established a deep learning baseline for Osu!. |
| BeatLearning (2024) | Osu! standard (expanding to others) | Transformer-based sequence generative model (BERT/GPT hybrid) | Yes – user selects a difficulty; the model is conditioned on it | Ongoing open-source project. Early results show promising beatmaps (no sliders yet). Aims to be a general foundation model for rhythm games. |
| TaikoNation (2021) | Taiko (2-pad drum) | LSTM RNN focused on pattern sequence generation | Partial – focuses on pattern quality; difficulty is not a main focus | Produced more human-like note patterns than prior ML approaches. Emphasized congruent patterns (key to playability in Taiko). |
| GenéLive! (2023) | Love Live! & similar (mobile, multi-track) | CNN + RNN (onset & sym modules) with a beat guide and multi-scale CNN enhancements | Yes – handles all in-game difficulty modes | Deployed in production (KLab). Halved chart design time; charts meet commercial quality. Open-sourced model used in a live game. |
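The "generate, then prune to a target density" step in the Udo et al. row, and the post-processing route to difficulty control more generally, is simple to illustrate. The plain-Python sketch below is a generic version of the idea; the ranking rule and numbers are illustrative assumptions, not the published refinement filter:

```python
# A generic sketch of "generate, then prune to a target density" for
# difficulty control (e.g., matching a reference notes-per-minute).
# The confidence-ranking rule below is illustrative, not the published rules.
def prune_to_density(onsets, scores, song_seconds, target_npm):
    """Keep the highest-confidence onsets until the chart's overall note
    density matches the target notes-per-minute for this difficulty.

    onsets: note times in seconds; scores: model confidence per onset.
    """
    max_notes = int(target_npm * song_seconds / 60.0)
    if len(onsets) <= max_notes:
        return sorted(onsets)                     # already sparse enough
    # Rank candidate notes by model confidence and keep the strongest.
    ranked = sorted(zip(scores, onsets), reverse=True)[:max_notes]
    return sorted(t for _, t in ranked)

# Example: prune an over-dense prediction down to a Medium-level chart.
times = [0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00, 2.25]
confs = [0.95, 0.40, 0.90, 0.35, 0.85, 0.30, 0.80, 0.25]
medium = prune_to_density(times, confs, song_seconds=2.5, target_npm=120)
# -> keeps 5 of 8 notes, the ones the model was most confident about
```

A real refinement filter would also enforce pattern constraints (the rule-based stage mentioned above), but a density budget alone already separates difficulty levels cleanly.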

Sources: The information and outcomes above are drawn from academic papers, project reports, and developer discussions for each system (see citations). Each represents a milestone toward automating rhythm game content creation with deep learning, balancing musical alignment, difficulty, and fun.

Conclusion

From 2017’s pioneering Dance Dance Convolution to recent industry models like GenéLive! (2023), deep learning methods have advanced rapidly in generating beat game charts from audio. These systems commonly break the problem into predicting when notes should occur (essentially a musical onset detection task) and what actions or patterns to perform (a sequence generation task akin to language modeling). They often incorporate difficulty as a parameter – either by conditioning the model on difficulty or by post-processing the output – so that the generated charts can cater to various skill levels.

Significantly, evaluations show that modern AI-generated charts can approach human-made quality: players sometimes found it difficult to distinguish AI-generated maps from human-made ones in terms of playability, and AI assistance is already speeding up professional chart design. While not every generated level is perfect, the trend is clear: leveraging CNNs, RNNs, and transformers (and even diffusion models in the latest attempts) has made automatic chart generation a practical reality. Ongoing work is improving musical structure awareness, pattern naturalness, and cross-game generalization, moving these tools ever closer to production-ready content creation across a variety of rhythm games. The convergence of academic research, open-source community projects, and commercial adoption suggests a bright future for AI-driven beatmap generation, where players can enjoy “infinite” new levels for their favorite songs at the click of a button.
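As a final concrete note on the "language modeling" framing above: a common trick is to serialize a chart into discrete tokens that any GPT-style sequence model can be trained on. The token scheme below is an illustrative assumption, not the format of any particular project surveyed here:

```python
# One way charts become a "language": serialize each note as discrete tokens,
# e.g. a time-shift token (quantized to a 16th-note grid) followed by an
# action token, then train a GPT-style model on the resulting token stream.
# This tokenization is an illustrative assumption, not a project's real format.
TICKS_PER_BEAT = 4                       # 16th-note grid

def chart_to_tokens(notes, bpm):
    """notes: list of (time_seconds, action_id) pairs -> token strings."""
    tokens, prev_tick = [], 0
    for t, action in sorted(notes):
        tick = round(t * bpm / 60.0 * TICKS_PER_BEAT)
        tokens.append(f"SHIFT_{tick - prev_tick}")   # rest since last note
        tokens.append(f"HIT_{action}")               # which lane/arrow
        prev_tick = tick
    return tokens

print(chart_to_tokens([(0.0, 0), (0.5, 2), (0.75, 3)], bpm=120))
# ['SHIFT_0', 'HIT_0', 'SHIFT_4', 'HIT_2', 'SHIFT_2', 'HIT_3']
```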